Empirical risk minimization (ERM) and distributionally robust optimization (DRO) are popular approaches for solving stochastic optimization problems that appear in operations management and machine learning. Existing generalization error bounds for these methods depend on either the complexity of the cost function or the dimension of the uncertain parameters; consequently, the performance of these methods is poor for high-dimensional problems with highly complex objective functions. We propose a simple approach in which the distribution of uncertain parameters is approximated using a parametric family of distributions. This mitigates both sources of complexity; however, it introduces a model misspecification error. We show that this new source of error can be controlled by suitable DRO formulations. Our proposed parametric DRO approach has significantly improved generalization bounds over existing ERM and DRO methods and parametric ERM for a wide variety of settings. Our method is particularly effective under distribution shifts. We also illustrate the superior performance of our approach on both synthetic and real-data portfolio optimization and regression tasks.
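As a rough illustration of the parametric idea (a sketch, not the paper's prescription): fit a parametric family to the observed uncertain parameters, here a Gaussian as an assumed example, then hedge the misspecification error with a KL-divergence ambiguity ball, whose worst-case expectation admits the classical dual form inf_{a>0} {a log E_P[exp(loss/a)] + a*rho}. The Gaussian family, the radius rho, and the portfolio setup below are all illustrative choices.

```python
import numpy as np
from scipy.special import logsumexp
from scipy.optimize import minimize_scalar

def kl_dro_worst_case(losses, rho):
    """Worst-case expected loss over a KL ball of radius rho around the
    sampled (parametric) distribution, via the dual
    inf_{a>0} a * log E_P[exp(loss/a)] + a * rho."""
    n = len(losses)
    def dual(a):
        return a * (logsumexp(losses / a) - np.log(n)) + a * rho
    return minimize_scalar(dual, bounds=(1e-6, 1e6), method="bounded").fun

rng = np.random.default_rng(0)
data = rng.normal(0.05, 0.2, size=500)      # observed uncertain parameters
mu, sigma = data.mean(), data.std(ddof=1)   # parametric (Gaussian) fit
sims = rng.normal(mu, sigma, size=10_000)   # simulate from the fitted model
w = 0.7                                     # a fixed portfolio weight
print(kl_dro_worst_case(-w * sims, rho=0.01))  # robust loss for this decision
```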
Language models demonstrate both quantitative improvements and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 442 authors across 132 institutions. Task topics are diverse, drawing from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; and social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.
Aleatoric uncertainty quantification seeks distributional knowledge of random responses, which is important for reliability analysis and robustness improvement in machine learning applications. Previous research on aleatoric uncertainty estimation mainly targets closed-form conditional densities or variances, which requires strong restrictions on the data distribution or dimensionality. To overcome these restrictions, we study conditional generative models for uncertainty estimation. We introduce two metrics that measure the discrepancy between two conditional distributions and that suit these models. Both metrics can be computed easily and unbiasedly via Monte Carlo simulation of the conditional generative models, thus facilitating their evaluation and training. We demonstrate numerically how our metrics provide correct measurements of conditional distributional discrepancies and can be used to train conditional models that are competitive against existing benchmarks.
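The paper's two metrics are not reproduced here; as a hedged illustration of the general recipe, an unbiased, simulation-friendly discrepancy between two sets of conditional samples, one can use the standard unbiased estimator of squared maximum mean discrepancy between generator outputs and held-out responses at matched covariates:

```python
import numpy as np

def rbf_gram(x, y, h=1.0):
    # Gaussian kernel matrix between the rows of x and y
    sq = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2 * h ** 2))

def mmd2_unbiased(xs, ys, h=1.0):
    """Unbiased estimator of squared MMD between samples xs and ys; an
    illustrative stand-in for the paper's metrics, which are not given here."""
    n, m = len(xs), len(ys)
    kxx, kyy, kxy = rbf_gram(xs, xs, h), rbf_gram(ys, ys, h), rbf_gram(xs, ys, h)
    np.fill_diagonal(kxx, 0.0)   # drop diagonal terms for unbiasedness
    np.fill_diagonal(kyy, 0.0)
    return kxx.sum() / (n * (n - 1)) + kyy.sum() / (m * (m - 1)) - 2 * kxy.mean()
```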
Multi-agent market simulation is commonly used to create an environment for downstream machine learning or reinforcement learning tasks, such as training or testing trading strategies before deploying them for live trading. In electronic trading markets, typically only the price or volume time series that result from the interaction of multiple market participants are directly observable. A multi-agent market environment therefore needs to be calibrated so that the time series resulting from the interaction of the simulated agents resemble the historical ones, which amounts to solving a highly complex, large-scale optimization problem. In this paper, we propose a simple and efficient framework for calibrating multi-agent market simulator parameters from historical time series observations. First, we consider a novel concept of eligibility to bypass the potential non-identifiability issue. Second, we generalize the two-sample Kolmogorov-Smirnov (K-S) test with Bonferroni correction to test the similarity between two high-dimensional time series distributions, which gives a simple yet effective distance metric between time series sample sets. Third, we suggest using Bayesian optimization (BO) and trust-region BO (TuRBO) to minimize the aforementioned distance metric. Finally, we demonstrate the efficiency of our framework using numerical experiments.
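A minimal sketch of the similarity test described above, assuming each set of time series has been flattened into an (n_samples, dim) matrix: run a marginal two-sample K-S test per coordinate, Bonferroni-correct the significance level, and take the largest K-S statistic as the distance that BO/TuRBO would then minimize.

```python
import numpy as np
from scipy.stats import ks_2samp

def ks_distance(A, B, alpha=0.05):
    """Bonferroni-corrected K-S comparison of two sample sets A, B of shape
    (n_samples, dim). Returns the largest marginal K-S statistic (used as a
    distance) and whether similarity is rejected at overall level alpha."""
    d = A.shape[1]
    results = [ks_2samp(A[:, j], B[:, j]) for j in range(d)]
    stats = [r.statistic for r in results]
    pvals = [r.pvalue for r in results]
    return max(stats), min(pvals) < alpha / d   # Bonferroni correction
```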
Rare-event simulation techniques, such as importance sampling (IS), constitute powerful tools to speed up the challenging estimation of probabilities of rare catastrophic events. These techniques often leverage knowledge and analysis of the underlying system structure to endow desirable efficiency guarantees. However, black-box problems, especially those arising from recent safety-critical applications of AI-driven physical systems, can fundamentally undermine their efficiency guarantees and lead to dangerously erroneous estimates that escape diagnostic detection. We propose a framework, called Deep Probabilistic Accelerated Evaluation (Deep-PrAE), to design statistically guaranteed IS by converting black-box samplers that are versatile but could lack guarantees into ones with what we term relaxed efficiency certificates, which allow accurate estimation of bounds on the rare-event probability. We present the theory of Deep-PrAE, which combines the dominating point concept with rare-event set learning via deep neural network classifiers, and demonstrate its effectiveness in numerical examples, including the safety testing of intelligent driving algorithms.
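For context, here is the textbook baseline rather than Deep-PrAE itself: importance sampling for a Gaussian tail event, with the proposal shifted to the dominating point of the rare-event set so that most samples land near the event boundary.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
t, n = 4.0, 100_000                        # rare event {X > t} with X ~ N(0, 1)
x = rng.normal(loc=t, scale=1.0, size=n)   # proposal centered at the dominating point
w = norm.pdf(x) / norm.pdf(x, loc=t)       # likelihood ratio p(x) / q(x)
print(np.mean((x > t) * w), norm.sf(t))    # IS estimate vs. exact tail probability
```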
Established approaches to obtaining generalization bounds in data-driven optimization and machine learning mostly build on solutions from empirical risk minimization (ERM), which depend crucially on the functional complexity of the hypothesis class. In this paper, we present an alternate route to obtain these bounds on the solution from distributionally robust optimization (DRO), a recent data-driven optimization framework based on worst-case analysis and the notion of an ambiguity set to capture statistical uncertainty. In contrast to the hypothesis-class complexity in ERM, our DRO bounds depend on the geometry of the ambiguity set and its compatibility with the true loss function. Notably, when using maximum mean discrepancy as the DRO distance metric, our analysis implies generalization bounds whose dependence on the hypothesis class appears the smallest possible: the bound depends solely on the true loss function and is independent of any other candidates in the hypothesis class. To the best of our knowledge, this is the first generalization bound of this type in the literature, and we hope our findings can open the door to a better understanding of DRO, especially its benefits for loss minimization and other machine learning applications.
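A one-line sketch of why the hypothesis-class dependence can drop out, under the assumption that the true loss $\ell$ lies in the RKHS $\mathcal{H}$ inducing the MMD: using the reproducing property $\mathbb{E}_Q[\ell] = \langle \ell, \mu_Q \rangle_{\mathcal{H}}$ and Cauchy-Schwarz,
\[
\sup_{Q:\,\mathrm{MMD}(Q,\hat{P}_n)\le\varepsilon} \mathbb{E}_Q[\ell] - \mathbb{E}_{\hat{P}_n}[\ell]
\;=\; \sup_{Q}\,\langle \ell,\; \mu_Q - \mu_{\hat{P}_n} \rangle_{\mathcal{H}}
\;\le\; \varepsilon\,\|\ell\|_{\mathcal{H}},
\]
so the resulting bound involves only the RKHS norm of the true loss and no supremum over other candidates in the hypothesis class.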
We investigate statistical uncertainty quantification for reinforcement learning (RL) and its implications for exploration policy. Despite the ever-growing literature on RL applications, fundamental questions about inference and error quantification, such as large-sample behaviors, appear to remain quite open. In this paper, we fill in this literature gap by studying the central limit theorem behaviors of estimated Q-values and value functions under various RL settings. In particular, we explicitly identify closed-form expressions of the asymptotic variances, which allow us to efficiently construct asymptotically valid confidence regions for key RL quantities. Furthermore, we utilize these asymptotic expressions to design an effective exploration strategy, which we call Q-value-based Optimal Computing Budget Allocation (Q-OCBA). The policy relies on maximizing the relative discrepancies among the Q-value estimates. Numerical experiments show that our exploration strategy outperforms other benchmark policies.
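A minimal sketch of the confidence-interval construction the abstract describes, assuming a point estimate q_hat, a plug-in estimate var_hat of the asymptotic variance (the paper's closed-form variance expressions are not reproduced here), and a sample size n:

```python
import numpy as np
from scipy.stats import norm

def q_confidence_interval(q_hat, var_hat, n, level=0.95):
    # CLT-based interval: q_hat +/- z_{(1+level)/2} * sqrt(var_hat / n)
    z = norm.ppf(0.5 + level / 2.0)
    half = z * np.sqrt(var_hat / n)
    return q_hat - half, q_hat + half

print(q_confidence_interval(q_hat=1.25, var_hat=4.0, n=2_000))
```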
We introduce a novel framework to track multiple objects in overhead camera videos for airport checkpoint security scenarios where targets correspond to passengers and their baggage items. We propose a Self-Supervised Learning (SSL) technique to provide the model information about instance segmentation uncertainty from overhead images. Our SSL approach improves object detection by employing a test-time data augmentation and a regression-based, rotation-invariant pseudo-label refinement technique. Our pseudo-label generation method provides multiple geometrically-transformed images as inputs to a Convolutional Neural Network (CNN), regresses the augmented detections generated by the network to reduce localization errors, and then clusters them using the mean-shift algorithm. The self-supervised detector model is used in a single-camera tracking algorithm to generate temporal identifiers for the targets. Our method also incorporates a multi-view trajectory association mechanism to maintain consistent temporal identifiers as passengers travel across camera views. An evaluation of detection, tracking, and association performances on videos obtained from multiple overhead cameras in a realistic airport checkpoint environment demonstrates the effectiveness of the proposed approach. Our results show that self-supervision improves object detection accuracy by up to $42\%$ without increasing the inference time of the model. Our multi-camera association method achieves up to $89\%$ multi-object tracking accuracy with an average computation time of less than $15$ ms.
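A hedged sketch of the clustering step above: detections from several test-time augmentations, mapped back to the original frame as (x1, y1, x2, y2) boxes, are grouped by mean-shift on their centers and fused per cluster. The regression-based refinement step is omitted, and the bandwidth is an illustrative parameter.

```python
import numpy as np
from sklearn.cluster import MeanShift

def fuse_augmented_detections(boxes, bandwidth=20.0):
    """Cluster box centers from multiple test-time augmentations and return
    one fused (mean) box per cluster; a sketch of the pseudo-label fusion
    idea, not the paper's full pipeline."""
    centers = np.column_stack([(boxes[:, 0] + boxes[:, 2]) / 2,
                               (boxes[:, 1] + boxes[:, 3]) / 2])
    labels = MeanShift(bandwidth=bandwidth).fit_predict(centers)
    return np.array([boxes[labels == k].mean(axis=0) for k in np.unique(labels)])
```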
In this paper, we propose a novel framework dubbed peer learning to deal with the problem of biased scene graph generation (SGG). This framework uses predicate sampling and consensus voting (PSCV) to encourage different peers to learn from each other, improving model diversity and mitigating bias in SGG. To address the heavily long-tailed distribution of predicate classes, we propose to use predicate sampling to divide and conquer this issue. As a result, the model is less biased and makes more balanced predicate predictions. Specifically, one peer may not be sufficiently diverse to discriminate between different levels of predicate distributions. Therefore, we sample the data distribution based on frequency of predicates into sub-distributions, selecting head, body, and tail classes to combine and feed to different peers as complementary predicate knowledge during the training process. The complementary predicate knowledge of these peers is then ensembled utilizing a consensus voting strategy, which simulates a civilized voting process in our society that emphasizes the majority opinion and diminishes the minority opinion. This approach ensures that the learned representations of each peer are optimally adapted to the various data distributions. Extensive experiments on the Visual Genome dataset demonstrate that PSCV outperforms previous methods. We have established a new state-of-the-art (SOTA) on the SGCls task by achieving a mean of \textbf{31.6}.
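The exact voting rule is not specified in the abstract; below is a plausible minimal sketch of consensus voting over the peers' predicate logits, with majority voting and ties broken by averaged logits, both of which are assumptions for illustration.

```python
import numpy as np

def consensus_vote(peer_logits):
    """Hypothetical consensus-voting step: each peer casts a vote for its
    top predicate; the majority label wins, with ties broken by the mean
    logits across peers. peer_logits has shape (n_peers, n_classes)."""
    votes = np.argmax(peer_logits, axis=-1)                   # one vote per peer
    counts = np.bincount(votes, minlength=peer_logits.shape[-1])
    winners = np.flatnonzero(counts == counts.max())          # majority label(s)
    if len(winners) == 1:
        return int(winners[0])
    mean_logits = peer_logits.mean(axis=0)                    # tie-break
    return int(winners[np.argmax(mean_logits[winners])])
```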
Audio-visual scene understanding is a challenging problem due to the unstructured spatial-temporal relations that exist in audio signals and the spatial layouts of different objects and various texture patterns in visual images. Recently, many studies have focused on abstracting features with convolutional neural networks, while the learning of explicit, semantically relevant frames of sound signals and visual images has been overlooked. To this end, we present an end-to-end framework, namely the attentional graph convolutional network (AGCN), for structure-aware audio-visual scene representation. First, the spectrogram of the sound and the input image are processed by a backbone network for feature extraction. Then, to build multi-scale hierarchical information of the input features, we utilize an attention fusion mechanism to aggregate features from multiple layers of the backbone network. Notably, to represent well the salient regions and contextual information of the audio-visual inputs, the salient acoustic graph (SAG) and contextual acoustic graph (CAG), along with the salient visual graph (SVG) and contextual visual graph (CVG), are constructed for the audio-visual scene representation. Finally, the constructed graphs pass through a graph convolutional network for structure-aware audio-visual scene recognition. Extensive experimental results on audio, visual, and audio-visual scene recognition datasets show that the AGCN achieves promising results. Visualizations of the graphs on spectrograms and images are presented to show that the proposed CAG/SAG and CVG/SVG focus on salient and semantically relevant regions.
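For reference, the generic graph-convolution building block that such a network stacks; this is a standard Kipf-Welling-style layer, an assumed stand-in rather than the paper's exact formulation.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One generic graph-convolution step,
    H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W),
    where A is an (n, n) adjacency matrix, H the (n, f) node features,
    and W an (f, f') weight matrix."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    d = A_hat.sum(axis=1)                   # node degrees
    A_norm = A_hat / np.sqrt(np.outer(d, d))  # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0.0)  # ReLU activation
```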